Programmatic Motifs and Genetic Programming

Authors

  • John R. Koza
  • David Andre
Abstract

As newly sequenced proteins are deposited into the world's ever-growing archive of protein sequences, they are typically immediately tested by various computerized algorithms for clues as to their biological structure and function. One question about a new protein involves its cellular location – that is, where the protein resides in a living organism (extracellular, intracellular, etc.). A 1997 paper reported a human-created five-way algorithm for cellular location created using statistical techniques with 76% accuracy. This paper describes a two-way classification algorithm that was evolved using genetic programming with 83% accuracy for determining whether a protein is extracellular. Unlike the statistical calculation, the genetically evolved algorithm employs a large and varied arsenal of computational capabilities, including arithmetic functions, conditional operations, subroutines, iterations, memory, data structures, set-creating operations, macro definitions, recursion, etc. The genetically evolved classification algorithm can be viewed as an extension (which we call a programmatic motif) of the conventional notion of a protein motif. The genetically evolved program constitutes an instance of an evolutionary computation technique producing a solution to a problem that is competitive with that produced using human intelligence.

1. Background and introduction

Proteins are responsible for such a wide variety of biological structures and functions that it can be said that the structure and functions of living organisms are primarily determined by proteins (Stryer 1995). Proteins are composed of a chain of amino acid residues in a linear arrangement. The same 20 amino acid residues (denoted by the letters A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y) are used for virtually all proteins of all species on earth. Thus, a protein can be viewed as a linear sequence over a 20-letter alphabet (called the primary structure of the protein).
The lengths of protein sequences vary widely, with the average being about 300 residues. For example, the primary structure of bovine pancreatic trypsin inhibitor (BPTI) contains only 58 amino acid residues, as shown below:

RPDFCLEPPY TGPCKARIIR YFYNAKAGLC QTFVYGGCRA KRNNFKSAED 50
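The view of a protein as a string over a 20-letter alphabet can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `AMINO_ACIDS`, `BPTI_PREFIX`, `parse_sequence`, and `composition` are hypothetical, and the sequence used is only the 50-residue portion of BPTI that appears in the excerpt above, not the full 58-residue protein.

```python
from collections import Counter

# The 20 one-letter amino acid codes listed in the text.
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")

# Assumption: only the first 50 residues of BPTI, exactly as printed in the
# excerpt (the remaining 8 residues are not shown there and are not guessed).
BPTI_PREFIX = "RPDFCLEPPY TGPCKARIIR YFYNAKAGLC QTFVYGGCRA KRNNFKSAED"


def parse_sequence(text):
    """Strip whitespace and check that every letter is one of the 20 codes."""
    seq = "".join(text.split()).upper()
    invalid = sorted(set(seq) - AMINO_ACIDS)
    if invalid:
        raise ValueError("not amino acid codes: " + ", ".join(invalid))
    return seq


def composition(seq):
    """Count how often each residue occurs in the primary structure."""
    return Counter(seq)


seq = parse_sequence(BPTI_PREFIX)
print(len(seq))                         # number of residues in the excerpt
print(composition(seq).most_common(3))  # the three most frequent residues
```

Grouping the residues in blocks of ten, as in the printed sequence, is only a display convention; `parse_sequence` removes the spaces before validating.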

Similar resources

Programmatic Compression of Images and Sound

The importance of digital data compression in the future media arena cannot be overestimated. A novel approach to data compression is built on Genetic Programming. This technique has been referred to as "programmatic compression". In this paper we apply a variant of programmatic compression to the compression of bitmap images and sampled digital sound. The work presented here constitutes the first...

Global Supply Chain Management under Carbon Emission Trading Program Using Mixed Integer Programming and Genetic Algorithm

In this paper, the transportation problem under the carbon emission trading program is modelled by mathematical programming and genetic algorithm. Since green supply chain issues become important and new legislations are taken into account, carbon emissions costs are included in the total costs of the supply chain. The optimisation model has the ability to minimise the total costs and provides the ...

Estimation of Discharge over the Submerged Compound Sharp-Crested Weir using Artificial Neural Networks and Genetic Programming

Truncated sharp crested weirs are used to measure flow rate and control upstream water surface in irrigation canals and laboratory flumes. The main advantages of such weirs are ease of construction and capability of measuring a wide range of flows with sufficient accuracy. Artificial neural networks (ANNs) and genetic programming (GP) have recently been used for estimation of hydraulic data. In...

Programmatic Compression of Natural Video

The use of digital video is increasing day by day. In the field of Genetic Programming a new approach called “programmatic compression” has been suggested for data compression. In this paper we describe how this technique can be applied to natural video. A programme generating intermediate frames of a video sequence is evolved where each frame is composed by a series of transformed regions from...

Evolving Compression Preprocessors With Genetic Programming

We present a new approach for applying genetic programming to lossless data compression. Unlike programmatic compression, the evolved programs are preprocessors. These preprocessors aim at enhancing the compression rate of the given data by transforming it. The entropy-based fitness function is both fast and independent of the type of information being processed. The obtained results are encouragi...

Publication date: 1998